Abstract

The world’s steady transition online has been expedited by the ongoing COVID-19 pandemic.
Libraries need to move swiftly to the online environment, and the semantic web and linked data
seem to be a perfect solution. Linked data will help to connect structured data and their
relationships on the Internet. By linking data, academic libraries will be able to achieve their
open access mission. Once all library materials, including open access publications, are linked on
the semantic web, these resources, representing all human knowledge, will become
interconnected. This will enable the description, access, and sharing of human knowledge without
any barriers, making such knowledge machine-readable and ready for processing by artificial
intelligence.
Linked Data is the Future of Description and Access in Academic Libraries

With the entire world transitioning online, the future of all libraries cannot be imagined 
without linked data; this is especially true for academic libraries. Linked data is a set of
practices for publishing, sharing, and connecting structured data and their relationships on the
web (Baker et al., 2011; Park & Kipp, 2019). In recent years, linking data
has become one of the top priorities for libraries (Zhu, 2019). In addition to text materials, there 
are image, video, and audio resources, as well as large datasets and raw data (e.g., genome, 
geospatial) that need to be linked to authors/creators, their affiliations, and published articles 
(Fernandez & Tilton, 2018). 

Transitioning from the world wide web to the semantic web is not possible without 
linking data. In fact, linked data is “authority control for the semantic web” (Zhu, 2019, p. 216) 
and “the heart of what semantic web is all about” (W3C, n.d.). The semantic web may be defined
as an environment in which all existing data are machine-readable and can be retrieved by simple
queries (W3C, n.d.). Linking data has become so important that new terms have been introduced,
such as “linked open data” and “library linked data” (Baker et al., 2011, para. 1).

One of the greatest benefits of linking data is that it will help librarians to meet one of 
their ultimate goals, described in the United Nations Universal Declaration of Human Rights, 
specifically, “the right to know and to be informed” (Ford, 2018, p. 267). It will also help to 
reduce and ultimately eliminate inequality in information and digital literacy. Linking data is not
an easy task, nor one that can be achieved overnight. Librarians need to examine all available
metadata tools and techniques used to make published research visible, discoverable, and usable
(Fernandez & Tilton, 2018).
Linked Data and Metadata Tools

Many libraries currently use the MARC schema; however, MARC cannot accommodate
linked data and is not suited to linking needs (Zhu, 2019). Therefore, another schema is
needed, one that supports the description formats required for linking data.
In fact, the transition to linked data has already begun; the MARC schema is now considered
a legacy system while libraries make that transition.

In the MARC schema, data are registered in the context of a record, identifiers are text
strings, and relationships are presented as notes. In linked data, things are described
independently of any record context, and identifiers and relationships are recorded as uniform
resource identifiers (Zhu, 2019, p. 218). Several standards and formats may serve as MARC’s
replacement, with the most suitable candidates being URIs (uniform resource identifiers), RDF
(Resource Description Framework), SPARQL (SPARQL Protocol and RDF Query Language), BIBFRAME,
OWL, and linked data in WorldCat (Fernandez & Tilton, 2018; W3C, n.d.).
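
The contrast between record-bound text strings and URI-based statements can be illustrated with a
minimal Python sketch. It is an illustration only: the sample records, the example.org URIs, the
vocabulary terms, and the function names are all hypothetical and are not taken from MARC,
BIBFRAME, or any of the cited sources.

```python
# Record-centric description: the author is a text string inside each record,
# so two spellings of the same name look like two different people.
record_style = [
    {"title": "Article A", "author": "Smith, Jane"},
    {"title": "Article B", "author": "Smith, J."},
]

# Linked-data description: every thing has its own URI and every statement is a
# (subject, predicate, object) triple. All URIs below are hypothetical.
EX = "http://example.org/"
triples = [
    (EX + "article/a", EX + "vocab/title", "Article A"),
    (EX + "article/a", EX + "vocab/creator", EX + "person/42"),
    (EX + "article/b", EX + "vocab/title", "Article B"),
    (EX + "article/b", EX + "vocab/creator", EX + "person/42"),
    (EX + "person/42", EX + "vocab/name", "Jane Smith"),
]


def works_by_string(records, name):
    """Find titles by exact match on the author text string."""
    return [r["title"] for r in records if r["author"] == name]


def works_by_uri(graph, person_uri):
    """Find titles of works whose creator is the given person URI."""
    creator = EX + "vocab/creator"
    title = EX + "vocab/title"
    works = [s for s, p, o in graph if p == creator and o == person_uri]
    return [o for s, p, o in graph if p == title and s in works]


string_hits = works_by_string(record_style, "Smith, Jane")
uri_hits = works_by_uri(triples, EX + "person/42")
```

In this sketch the string match finds only the record that happens to use the same spelling of the
name, whereas the URI match finds both works because both point to the same identifier.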

BIBFRAME, which is supported by the Library of Congress and Zepheira, is helpful during the
transition from MARC records to linked data. However, BIBFRAME relies heavily on
Wikipedia, which is biased (Wikipedia, n.d.-a). Wikipedia itself states in a disclaimer that its
information “should not be considered a definitive source in and of itself” (Wikipedia, n.d.-b,
para. 1). Wikipedia is a crowdsourced platform, and its records are created by ordinary
people who have different education, interests, beliefs, and biases. If errors are
introduced, these may take “days, weeks, months, or even years” to correct (para. 1). In addition,
many references have dead or broken links, making these sources unreliable.

The most promising format for the semantic web, and thus for linking needs, is the
Resource Description Framework (RDF), as it is schema-neutral. RDF is based on finding
common attributes in other resource types in order to connect to them (Taylor &
Joudrey, 2018, p. 293). It is the only schema that could potentially connect all other existing
schemas, and the work could be completed without re-coding existing records. The process
includes creating metadata in RDF, storing them, designing URIs, and interlinking those URIs with
other URIs as well as with common fields in other datasets (Zhu, 2019, p. 220). The interlinking
makes more records visible and discoverable. Lincoln (2015) believes SPARQL is useful for
translating complicated graphs into simple tables that can be read by Excel
and suggests that its combination with RDF makes it a powerful tool for linking data. In addition,
RDA entities are transferable to BIBFRAME, except for one of the core entities, “expression,” which
does not have an equivalent in BIBFRAME (Zapounidou et al., 2019, p. 280). According to the
managing director (Hennelly, n.d.), the RDA Toolkit will make linking much easier, as links could
be built not only to MARC and other schema records; RDA data may also stand on their own and be
linked to the RDA rules using RDA URIs. This promising tool may allow fast and smooth
linking of all resource data.
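
The idea of translating a graph into a simple table can be sketched in a few lines of Python. This
is not SPARQL itself: the small `match` function below only imitates the way a SPARQL query matches
triple patterns containing variables, and the graph, the example.org URIs, and all function names
are assumptions made for illustration. The result is written as CSV text, a format that
spreadsheet software such as Excel can open.

```python
import csv
import io

EX = "http://example.org/"  # hypothetical namespace

# A tiny graph of (subject, predicate, object) triples.
graph = [
    (EX + "article/a", EX + "vocab/title", "Article A"),
    (EX + "article/a", EX + "vocab/creator", EX + "person/42"),
    (EX + "person/42", EX + "vocab/name", "Jane Smith"),
    (EX + "person/42", EX + "vocab/affiliation", "Example University"),
]


def match(graph, patterns):
    """Return every set of variable bindings that satisfies all patterns.

    Terms that start with '?' are variables; other terms must match exactly.
    """
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for binding in solutions:
            for triple in graph:
                candidate = dict(binding)
                ok = True
                for term, value in zip(pattern, triple):
                    if term.startswith("?"):
                        # Bind the variable, or check it against an earlier binding.
                        if candidate.setdefault(term, value) != value:
                            ok = False
                            break
                    elif term != value:
                        ok = False
                        break
                if ok:
                    next_solutions.append(candidate)
        solutions = next_solutions
    return solutions


def to_csv(rows, columns):
    """Flatten the bindings into CSV text with one column per variable."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([c.lstrip("?") for c in columns])
    for row in rows:
        writer.writerow([row[c] for c in columns])
    return buffer.getvalue()


# "Which titles were created by which named persons?"
query = [
    ("?work", EX + "vocab/title", "?title"),
    ("?work", EX + "vocab/creator", "?person"),
    ("?person", EX + "vocab/name", "?name"),
]
table = to_csv(match(graph, query), ["?title", "?name"])
```

The query walks two links in the graph (work to person, person to name) and returns a flat
two-column table, which is the kind of graph-to-table translation that Lincoln (2015) describes.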

Several factors require special attention when linking data, such as human factors
(Schilling, 2012, p. 8), privacy, security (Kirrane et al., 2018, p. 153), infrastructure,
policy-related issues, and process integrity (Harron et al., 2017, p. 7). Additional resources are
also required for these, as well as for data quality assurance (Schilling, 2012, p. 8), correcting
typos, repairing links to outdated sources, correcting relationships, and fixing errors created
during the machine-reading entry process (Harron et al., 2017, p. 7). To summarize, to link data
properly, the following four requirements described by Berners-Lee (2016) need to be met: 1) URIs
for things, 2) HTTP for lookup, 3) RDF or SPARQL for description, and 4) links for connecting to
other URIs. Despite the need for additional human and infrastructure resources, the end product, a
global network of all human knowledge, is invaluable for libraries and humankind.
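
The four requirements can be read as a simple checklist. The Python sketch below applies such a
checklist to a small set of descriptions; the function names, the report keys, and the example.org
data are hypothetical, and the checks are deliberately rough (for instance, nothing is actually
looked up over the network).

```python
from urllib.parse import urlparse


def is_http_uri(value):
    """True if the value looks like an http(s) URI with a host name."""
    parts = urlparse(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def check_requirements(things):
    """Check each described thing against the four requirements.

    `things` maps an identifier to its list of (predicate, object) statements.
    """
    report = {}
    for identifier, statements in things.items():
        objects = [obj for _, obj in statements]
        report[identifier] = {
            # 1) the thing is named with a URI rather than a bare text string
            "uri_name": urlparse(identifier).scheme != "",
            # 2) the URI uses HTTP, so it could be looked up
            "http_lookup": is_http_uri(identifier),
            # 3) at least one descriptive statement is provided
            "described": len(statements) > 0,
            # 4) at least one statement links to another HTTP URI
            "links_out": any(is_http_uri(o) and o != identifier for o in objects),
        }
    return report


EX = "http://example.org/"  # hypothetical namespace
things = {
    EX + "article/a": [
        (EX + "vocab/title", "Article A"),
        (EX + "vocab/creator", EX + "person/42"),
    ],
    "Smith, Jane": [],  # a bare text-string identifier with no description
}
report = check_requirements(things)
```

In this sketch the article identified by a URI passes all four checks, while the bare name string
passes none of them.
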
Linked Data in Academic Libraries and Open Access 

While linking all library resources to make them discoverable on the semantic web is an
ultimate goal (Zhu, 2019, p. 215), academic libraries tasked with implementing open
access cannot do so successfully without the ability to link data.

Academic librarians work hard to make all published research results open to the public.
Open access is defined as immediate and unlimited public access to all published research
results. The Open Access Movement (Budapest Open Access Initiative, n.d.) has been around for
almost twenty years; however, it has yet to reach its full potential because both
librarians and academic researchers face various barriers. The ultimate mission of open access is
not only to achieve access to all published research materials but also to “open science”
(Piazzini, 2020) and “open education” (Smith & Dickson, 2017). These would be achieved by
allowing everyone to have unrestricted rights to access, read, save, print, and reuse research
results immediately upon publication (Crawford, 2011; Smith & Dickson, 2017). The estimated
share of closed, or non-open access, articles varies from a quarter to three quarters of published
articles, depending on the discipline (Fruin, 2019; Mikki, 2017; O’Hanlon et al.,
2020).

Gold and green open access routes are closest to the ultimate open access goal because, although
costly for authors, they are free for users to access (Finlay, 2019; Kelly, 2013; Sotudeh &
Estakhr, 2018). In an attempt to meet the requirements of the open access mission, librarians
created the green route by developing institutional repositories where authors could upload their
articles (Bedord, 2018, p. 63). The Canadian Association of Research Libraries (n.d.-a) described
an institutional repository as a “digital archive of an institution’s intellectual output.” Ways
to locate articles “populated with content metadata” (Finlay, 2019, p. 6) in institutional
repositories included DOIs (digital object identifiers) (DOI, n.d.) and open uniform resource
locators (URLs), with the latter accepted as a National Information Standards
Organization (NISO) standard (p. 7). However, Finlay (2019) found that researchers preferred to
upload their articles elsewhere rather than to institutional repositories because they
believed that articles stored in institutional repositories remained undiscoverable.
Instead, researchers uploaded their articles to ResearchGate, SciHub, and academic social
networks (Finlay, 2019). By doing this, they were able to reach a larger audience. The downside
of this so-called black open access route was that this sharing practice was illegal.

In June 2020, the Canadian Association of Research Libraries (n.d.-b), recognizing the
limitations of current institutional repositories, started a new initiative that would
connect all existing repositories into one central repository. This great initiative, however,
will only work if all data are linked. As an example, I used an article written by Zhang and
Watson (2017), academic librarians at the University of Saskatchewan. Their study of physical
sciences researchers funded by the Canadian Institutes of Health Research found that 87%-91% of
publications were closed. For some time, this open access (and thus open to the public) article
was itself undiscoverable on the web. It would not come up in my searches unless I looked
for it on the University of Saskatchewan’s website, in the research archive called HARVEST,
built on the DSpace platform (University of Saskatchewan, n.d.-a). However, once the URI
(http://hdl.handle.net/10388/8089) was generated for this record (and relationships were created),
the article became discoverable on the web, and a simple Google search brought it up. The
article’s characteristics also linked it to other valuable resource materials.
While open data are more about the legality of openly sharing data, linked data are about
the technical side of the process (Baker et al., 2011). Once an open access article is published
following the linked data lifecycle, it becomes discoverable. This leads to increased visibility of
published articles, expanded collaboration among researchers, improved research productivity,
and a growing number of citations. Once all library materials, including open access publications,
are linked on the semantic web, these resources, representing the entirety of human knowledge,
will be machine-readable and available for processing by artificial intelligence (Zhu, 2019,
p. 215). Once data are open and interconnected, academic libraries will be able to achieve their
open access mission: to enable access to and sharing of knowledge without any barriers.